Label quality score


ObjectLab: Automated Diagnosis of Mislabeled Images in Object Detection Data

Tkachenko, Ulyana, Thyagarajan, Aditya, Mueller, Jonas

arXiv.org Artificial Intelligence

Despite powering sensitive systems like autonomous vehicles, object detection remains fairly brittle in part due to annotation errors that plague most real-world training datasets. We propose ObjectLab, a straightforward algorithm to detect diverse errors in object detection labels, including: overlooked bounding boxes, badly located boxes, and incorrect class label assignments. ObjectLab utilizes any trained object detection model to score the label quality of each image, such that mislabeled images can be automatically prioritized.

Such Swapped errors are also common in many classification datasets (Northcutt et al., 2021a), but the increased complexity of object detection annotation introduces potential for more varied types of label errors than encountered in classification. We propose an algorithm, ObjectLab, that utilizes any trained object detection model to estimate the incorrect labels in such a dataset, regardless of which of these 3 types of mistake the data annotators made. Training and evaluating models with incorrect bounding box annotations is clearly worrisome.
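Because ObjectLab only needs the outputs of a trained detector, the flavor of its per-image scoring can be illustrated with a toy sketch. The matching rule, IoU threshold, and min-pooling below are illustrative assumptions, not the paper's actual scoring algorithm:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def image_label_quality(annotated, predicted, conf_threshold=0.5):
    """Toy per-image label quality score in the spirit of ObjectLab
    (a simplified illustration, not the paper's scoring rule).

    annotated: list of (box, class_id) from the dataset's labels
    predicted: list of (box, class_id, confidence) from any trained detector

    Each annotated box is credited by its best-IoU prediction of the same
    class (penalizing badly located boxes and swapped class labels), while
    confident predictions that match no annotation suggest overlooked boxes.
    """
    scores = []
    for box, cls in annotated:
        # best same-class agreement; low IoU means a bad location or swap
        best = max((iou(box, pb) for pb, pc, _ in predicted if pc == cls),
                   default=0.0)
        scores.append(best)
    for pb, pc, conf in predicted:
        if conf >= conf_threshold:
            covered = max((iou(pb, box) for box, _ in annotated), default=0.0)
            if covered < 0.5:
                scores.append(1.0 - conf)  # likely overlooked annotation
    # pool with the minimum so one bad box flags the whole image
    return float(min(scores)) if scores else 1.0
```

Low-scoring images would then be the first candidates for manual label review.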


Estimating label quality and errors in semantic segmentation data via any model

Lad, Vedang, Mueller, Jonas

arXiv.org Artificial Intelligence

The labor-intensive annotation process of semantic segmentation datasets is often prone to errors, since humans struggle to label every pixel correctly. We study algorithms to automatically detect such annotation errors, in particular methods to score label quality, such that the images with the lowest scores are least likely to be correctly labeled. This helps prioritize what data to review in order to ensure a high-quality training/evaluation dataset, which is critical in sensitive applications such as medical imaging and autonomous vehicles. Widely applicable, our label quality scores rely on probabilistic predictions from a trained segmentation model -- any model architecture and training procedure can be utilized. Here we study 7 different label quality scoring methods used in conjunction with a DeepLabV3+ or an FPN segmentation model to detect annotation errors in a version of the SYNTHIA dataset. Precision-recall evaluations reveal a score -- the soft-minimum of the model-estimated likelihoods of each pixel's annotated class -- that is particularly effective to identify images that are mislabeled, across multiple types of annotation error.
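The soft-minimum score described above can be sketched as follows; the particular softmin weighting and temperature used here are assumptions for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def softmin_label_quality(probs, labels, temperature=0.1):
    """Score an image's label quality as the soft-minimum of the
    model-estimated likelihood of each pixel's annotated class.

    probs:  (H, W, K) per-pixel predicted class probabilities
    labels: (H, W) annotated class index for each pixel

    Lower scores indicate images more likely to contain annotation errors.
    """
    h, w, _ = probs.shape
    # likelihood the model assigns to each pixel's annotated class
    self_conf = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    x = self_conf.ravel()
    # soft minimum: a weighted average dominated by the least-confident
    # pixels, smoother than a hard min over millions of pixels
    weights = np.exp(-x / temperature)
    return float(np.sum(x * weights) / np.sum(weights))
```

Compared with a hard minimum, the soft-minimum is less sensitive to a single noisy pixel prediction while still emphasizing the worst-annotated regions.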


Identifying Incorrect Annotations in Multi-Label Classification Data

Thyagarajan, Aditya, Snorrason, Elías, Northcutt, Curtis, Mueller, Jonas

arXiv.org Artificial Intelligence

In multi-label classification, each example in a dataset may be annotated as belonging to one or more classes (or none of the classes). Example applications include image (or document) tagging where each possible tag either applies to a particular image (or document) or not. With many possible classes to consider, data annotators are likely to make errors when labeling such data in practice. Here we consider algorithms for finding mislabeled examples in multi-label classification datasets. We propose an extension of the Confident Learning framework to this setting, as well as a label quality score that ranks examples with label errors much higher than those which are correctly labeled. Both approaches can utilize any trained classifier. After demonstrating that our methodology empirically outperforms other algorithms for label error detection, we apply our approach to discover many label errors in the CelebA image tagging dataset.
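Since each class in the multi-label setting is an independent yes/no annotation, a per-example quality score can be sketched by checking how plausible the model finds each annotated 0/1 decision. The min-pooling over classes used here is an assumption for illustration (the paper compares several aggregation choices):

```python
import numpy as np

def multilabel_quality_score(pred_probs, given_labels):
    """Rank examples in a multi-label dataset by annotation quality.

    pred_probs:   (N, K) predicted probability that each class applies
    given_labels: (N, K) binary annotation matrix (1 = tag was applied)

    For each class we take the model-estimated probability that the
    annotator's 0/1 decision was correct, then pool with the minimum
    over classes. Lower scores rank examples as more likely mislabeled.
    """
    # probability that the per-class annotation decision is right
    class_conf = np.where(given_labels == 1, pred_probs, 1.0 - pred_probs)
    return class_conf.min(axis=1)
```

Sorting examples by this score ascending surfaces the annotations that most disagree with any trained classifier's predictions.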


Detecting Label Errors in Token Classification Data

Wang, Wei-Chen, Mueller, Jonas

arXiv.org Artificial Intelligence

Mislabeled examples are a common issue in real-world data, particularly for tasks like token classification where many labels must be chosen on a fine-grained basis. Here we consider the task of finding sentences that contain label errors in token classification datasets. We study 11 different straightforward methods that score tokens/sentences based on the predicted class probabilities output by any token classification model (trained via any procedure). In precision-recall evaluations based on real-world label errors in entity recognition data from CoNLL-2003, we identify a simple and effective method that consistently detects those sentences containing label errors when applied with different token classification models.
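One of the simplest sentence-scoring rules in this family can be sketched as follows; taking the minimum over token self-confidences is one plausible choice among the 11 methods studied, used here purely for illustration:

```python
import numpy as np

def sentence_label_quality(token_probs, token_labels):
    """Score one sentence by the lowest model-estimated probability
    assigned to any token's annotated label.

    token_probs:  (num_tokens, K) predicted class probabilities per token
    token_labels: (num_tokens,) annotated class index for each token

    A single implausibly-labeled token drags the whole sentence's score
    down, so low-scoring sentences are prioritized for review.
    """
    self_conf = token_probs[np.arange(len(token_labels)), token_labels]
    return float(self_conf.min())
```

Ranking all sentences by this score ascending yields the precision-recall style evaluation described above: sentences with genuine label errors should cluster at the top of the review queue.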